Correcting the Bias of Empirical Frequency Parameter Estimators in Codon Models

Author: C Kosiol
C Seoighe
G Schwarz
GC Conant
Konrad Scheffler
M Anisimova
M Lacerda
N Goldman
S Whelan
Sergei Kosakovsky Pond
SL Kosakovsky Pond
Spencer V. Muse
SV Muse
Thomas Mailund
W Delport
Wayne Delport
Publication venue: Public Library of Science
Publication date: 01/01/2010

Markov models of codon substitution are powerful inferential tools for studying biological processes such as natural selection and preferences in amino acid substitution. The equilibrium character distributions of these models are almost always estimated using nucleotide frequencies observed in a sequence alignment, primarily as a matter of historical convention. In this note, we demonstrate that a popular class of such estimators is biased, and that this bias has an adverse effect on goodness of fit and estimates of substitution rates. We propose a “corrected” empirical estimator that begins with observed nucleotide counts, but accounts for the nucleotide composition of stop codons. We show via simulation that the corrected estimates outperform the de facto standard estimates not just by providing better estimates of the frequencies themselves, but also by leading to improved estimation of other parameters in the evolutionary models. On a curated collection of sequence alignments, our estimators show a significant improvement in goodness of fit compared to the de facto standard approach. Maximum likelihood estimation of the frequency parameters appears to be warranted in many cases, albeit at a greater computational cost. Our results demonstrate that there is little justification, either statistical or computational, for continued use of the de facto standard estimators.
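The source of the bias described in this abstract can be illustrated with a small sketch. The abstract does not spell out the form of the estimator, so the code below assumes a product-of-position-specific nucleotide frequencies estimator and the standard genetic code (stop codons TAA, TAG, TGA). The function names are hypothetical, and this is not the authors' corrected estimator. It only shows that, once stop codons are excluded and the remainder renormalized, the nucleotide frequencies implied by the codon frequencies no longer match the nucleotide frequencies that were fed in.

```python
from itertools import product

NUCS = "TCAG"
STOPS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def product_codon_freqs(pos_freqs):
    """Naive estimator: multiply position-specific nucleotide frequencies,
    drop stop codons, and renormalize over the 61 sense codons."""
    raw = {}
    for codon in map("".join, product(NUCS, repeat=3)):
        if codon in STOPS:
            continue
        raw[codon] = (pos_freqs[0][codon[0]]
                      * pos_freqs[1][codon[1]]
                      * pos_freqs[2][codon[2]])
    total = sum(raw.values())
    return {codon: f / total for codon, f in raw.items()}


def marginal_nuc_freqs(codon_freqs):
    """Nucleotide frequencies at each codon position implied by codon frequencies."""
    marg = [dict.fromkeys(NUCS, 0.0) for _ in range(3)]
    for codon, f in codon_freqs.items():
        for pos, nuc in enumerate(codon):
            marg[pos][nuc] += f
    return marg


# Illustrative input only: uniform nucleotide frequencies at every position.
uniform = [dict.fromkeys(NUCS, 0.25) for _ in range(3)]
codon_freqs = product_codon_freqs(uniform)
implied = marginal_nuc_freqs(codon_freqs)

# The input says T has frequency 0.25 at position 1, but only 13 of the
# 61 sense codons start with T, so the implied frequency is 13/61.
print(implied[0]["T"], 13 / 61)
```

With uniform inputs every sense codon receives frequency 1/61, so the implied first-position frequency of T falls below 0.25 while C, A and G rise above it. A corrected estimator would instead need to choose nucleotide parameters whose implied marginals match the observed counts.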

CiteSeerX

Stellenbosch University SUNScholar Repository

Chronic defensiveness and neuroendocrine dysfunction reflect a novel cardiac troponin T cut point: The SABPA study.

Author: Gavin W. Lambert (3636757)
Hendrik Stefanus Steyn (7244300)
Leone Malan (7242182)
Mark Hamer (1254141)
Nico T. Malan (7243715)
Rhena Delport (7244297)
Roland von Kanel (7243703)
Publication date: 01/01/2017

Background: Sympatho-adrenal responses are activated as an innate defense coping (DefS) mechanism during emotional stress. Whether these sympatho-adrenal responses drive cardiac troponin T (cTnT) increases is unknown. Therefore, associations between cTnT and sympatho-adrenal responses were assessed. Methods: A prospective bi-ethnic cohort, excluding atrial fibrillation, myocardial infarction and stroke cases, was followed for 3 years (N=342; 45.6±9.0 years). We obtained serum high-sensitive cTnT and outcome measures [Coping-Strategy-Indicator, depression/Patient-Health-Questionnaire-9, 24h BP, 24h heart-rate-variability (HRV) and 24h urinary catecholamines]. Results: cTnT levels of the cohort remained similar over 3 years but recovery to cTnT-negative levels was higher in Blacks. Blacks showed moderate depression (45% vs. 16%) and 24h hypertension (67% vs. 42%) prevalence compared to Whites. A receiver-operating-characteristics cTnT cut-point 4.2 ng/L predicting hypertension in Blacks was used as binary exposure measure in relation to outcome measures [AUC 0.68 (95% CI 0.60-0.76); sensitivity/specificity 63/70%; P≤0.001]. In cross-sectional analyses, elevated cTnT was related to DefS [OR 1.08 (95% CI 0.99-1.16); P=0.06]; 24h BP [OR 1.03-1.04 (95% CI 1.01-1.08); P≤0.02] and depressed HRV [OR 2.19 (95% CI 1.09-4.41); P=0.03] in Blacks, but not in Whites. At 3 year follow-up, elevated cTnT was related to attenuated urine norepinephrine:creatinine ratio in Blacks [OR 1.46 (95% CI 1.01-2.10); P=0.04]. In Whites, a cut point of 5.6 ng/L cTnT predicting hypertension was not associated with outcome measures. Conclusion: Central neural control systems exemplified a brain-heart stress pathway. Desensitization of sympatho-adrenal responses occurred with initial neural- (HRV) followed by neuroendocrine dysfunction (norepinephrine:creatinine) in relation to elevated cTnT.
Chronic defensiveness may thus drive the desensitization or physiological depression, reflecting ischemic heart disease risk at a 4.2 ng/L cTnT cut-point in Blacks.

Loughborough University Institutional Repository

Experimental evidence indicating that mastreviruses probably did not co-diverge with their hosts

Author: Briddon Rob W
Delport Wayne
Donaldson Lara
Duffy Siobain
Harkins Gordon W
Martin Darren P
Monjane Adérito L
Owor Betty E
Rybicki Edward P
Saumtally Salem
Shepherd Dionne N
Triton Guy
Varsani Arvind
Wood Natasha
Publication venue: BioMed Central
Publication date: 01/01/2009

Background. Despite the demonstration that geminiviruses, like many other single stranded DNA viruses, are evolving at rates similar to those of RNA viruses, a recent study has suggested that grass-infecting species in the genus Mastrevirus may have co-diverged with their hosts over millions of years. This "co-divergence hypothesis" requires that long-term mastrevirus substitution rates be at least 100,000-fold lower than their basal mutation rates and 10,000-fold lower than their observable short-term substitution rates. The credibility of this hypothesis, therefore, hinges on the testable claim that negative selection during mastrevirus evolution is so potent that it effectively purges 99.999% of all mutations that occur. Results. We have conducted long-term evolution experiments lasting between 6 and 32 years, where we have determined substitution rates of between 2 and 3 × 10⁻⁴ substitutions/site/year for the mastreviruses Maize streak virus (MSV) and Sugarcane streak Réunion virus (SSRV). We further show that mutation biases are similar for different geminivirus genera, suggesting that mutational processes that drive high basal mutation rates are conserved across the family. Rather than displaying signs of extremely severe negative selection as implied by the co-divergence hypothesis, our evolution experiments indicate that MSV and SSRV are predominantly evolving under neutral genetic drift. Conclusion. The absence of strong negative selection signals within our evolution experiments and the uniformly high geminivirus substitution rates that we and others have reported suggest that mastreviruses cannot have co-diverged with their hosts. © 2009 Harkins et al; licensee BioMed Central Ltd.

Cape Town University OpenUCT

Springer - Publisher Connector

UC Research Repository

Queensland University of Technology ePrints Archive

National Research Foundation

Evolutionary distances in the twilight zone -- a rational kernel approach

Author: A Keller
A Löytynoja
A Stamatakis
B Chor
B Schölkopf
Benjamin Merget
C Cortes
C Daskalakis
CB Do
E Rivas
F Bemm
Florian Markowetz
Frank Förster
G Talavera
HH Otu
I Ulitsky
J Felsenstein
J Friedrich
J Hein
JL Thorne
Jörg Schultz
KM Wong
LS Wang
M Höhl
M Mohri
M Wolf
MA Buchheim
MA Suchard
Matthias Wolf
MJ Bishop
MK Kuhner
MS Waterman
N Goldman
N Higham
R Durbin
RC Edgar
RF Doolittle
Roland F. Schwarz
S Roch
S Whelan
SR Eddy
T Mailund
T Müller
TH Ogden
V Levenshtein
W Fletcher
Wayne Delport
William Fletcher
Publication venue: Public Library of Science (PLoS)
Publication date: 23/11/2010

Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets. Comment: to appear in PLoS ON

arXiv.org e-Print Archive

UPSpace at the University of Pretoria

MDC Repository

Mitochondrial DNA control region data from indigenous Angolan Khoe-San lineages

Author: Bodner Martin
Delport Rhena
Fendt Liane
Huber Gabriela
Parson Walther
Rock Alexander W.
Schmidt Konrad
Zimmermann Bettina
Publication venue: Elsevier BV
Publication date: 01/09/2012

Here we provide 129 complete mitochondrial control region sequences of indigenous Khoe-San individuals from Angola to contribute to the still underrepresented pool of data from Africa. The dataset consists of exclusively African lineages with a majority of Sub-Saharan haplogroups. The probability of a random match was calculated as 0.09. The data set comprises 21 haplotypes occurring more than once and 17 unique haplotypes. Upon publication, haplotypes were incorporated in the EMPOP database (www.empop.org; EMP00069) [1]. http://www.elsevier.com/locate/fsi
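The abstract reports a random match probability without stating how it was computed. A minimal sketch, assuming the common definition as the sum of squared haplotype frequencies, is given below. The function name and the toy counts are hypothetical and are not the Angolan data, whose per-haplotype counts are not listed here.

```python
def random_match_probability(haplotype_counts):
    """Sum of squared haplotype frequencies: the chance that two sequences
    drawn at random (with replacement) share the same haplotype."""
    n = sum(haplotype_counts)
    return sum((count / n) ** 2 for count in haplotype_counts)


# Hypothetical toy sample: four haplotypes seen 4, 3, 2 and 1 times.
toy_counts = [4, 3, 2, 1]
rmp = random_match_probability(toy_counts)
print(round(rmp, 2))
```

Under this definition, a sample made up entirely of unique haplotypes gives 1/n, and the value rises as more sequences share a haplotype.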

CodonTest: Modeling Amino Acid Substitution Preferences in Coding Sequences

Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models, however, assume a single rate of non-synonymous substitution irrespective of the nature of amino acids being exchanged. Recent developments have shown that models which allow for amino acid pairs to have independent rates of substitution offer improved fit over single rate models. However, these approaches have been limited by the necessity for large alignments in their estimation. An alternative approach is to assume that substitution rates between amino acid pairs can be subdivided into rate classes, dependent on the information content of the alignment. However, given the combinatorially large number of such models, an efficient model search strategy is needed. Here we develop a Genetic Algorithm (GA) method for the estimation of such models. A GA is used to assign amino acid substitution pairs to a series of rate classes, the number of which is estimated from the alignment. Other parameters of the phylogenetic Markov model, including substitution rates, character frequencies and branch lengths, are estimated using standard maximum likelihood optimization procedures. We apply the GA to empirical alignments and show improved model fit over existing models of codon evolution. Our results suggest that current models are poor approximations of protein evolution and thus gene and organism specific multi-rate models that incorporate amino acid substitution biases are preferred. We further anticipate that the clustering of amino acid substitution rates into classes will be biologically informative, such that genes with similar functions exhibit similar clustering, and hence this clustering will be useful for the evolutionary fingerprinting of genes.

Cronfa at Swansea University

Stellenbosch University SUNScholar Repository

Molecular evolution of HoxA13 and the multiple origins of limbless morphologies in amphibians and reptiles

Developmental processes and their results, morphological characters, are inherited through transmission of genes regulating development. While there is ample evidence that cis-regulatory elements tend to be modular, with sequence segments dedicated to different roles, the situation for proteins is less clear, being particularly complex for transcription factors with multiple functions. Some motifs mediating protein-protein interactions may be exclusive to particular developmental roles, but it is also possible that motifs are mostly shared among different processes. Here we focus on HoxA13, a protein essential for limb development. We asked whether the HoxA13 amino acid sequence evolved similarly in three limbless clades: Gymnophiona, Amphisbaenia and Serpentes. We explored variation in ω (dN/dS) using a maximum-likelihood framework and HoxA13 sequences from 47 species. Comparisons of evolutionary models provided low ω global values and no evidence that HoxA13 experienced relaxed selection in limbless clades. Branch-site models failed to detect evidence for positive selection acting on any site along branches of Amphisbaena and Gymnophiona, while three sites were identified in Serpentes. Examination of alignments did not reveal consistent sequence differences between limbed and limbless species. We conclude that HoxA13 has no modules exclusive to limb development, which may be explained by its involvement in multiple developmental processes.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

University of Dundee Online Publications

Repositorio da Producao Cientifica e Intelectual da Unicamp

Recent acquisition of Helicobacter pylori by Baka Pygmies

Author: AB Migliano
Armand Nkwescheu
Ayas Maady
B Linz
B Pakendorf
Bodo Linz
C Batini
C Ehret
CM Schlebusch
CY Tay
D Falush
D Schoenbrun
Daniel Eibach
Daniel Falush
DD Palmer
DM Behar
DM Magalhaes Queiroz
DY Graham
E Patin
G Brusotti
G Hellenthal
G Jobb
G Morelli
Gil McVean
JC Assob
JG Kusters
Jose Siri
JZ Li
KA Jolley
L Excoffier
L Naduvilezhath
L Quintana-Murci
LA Zhivotovsky
M Achtman
M Guisset
M Jakobsson
M Kawai
Mark Achtman
MC Campbell
MF Rupnow
MJ Hamilton
MK Gonder
NA Drake
NA Rosenberg
NS Becker
P Librado
RE Pounder
RN Ndip
RR Hudson
RW Frenck Jr
S Breurec
S Latifi-Navid
S Schwarz
SA Tishkoff
Sandra Nell
SC Schuster
Sebastian Suerbaum
T Wirth
TH Farag
V Kuete
V Montano
Valeria Montano
W Delport
Wael F. Elamin
X Didelot
Y Chen
Y Moodley
Yoshan Moodley
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

Both anatomically modern humans and the gastric pathogen Helicobacter pylori originated in Africa, and both species have been associated for at least 100,000 years. Seven geographically distinct H. pylori populations exist, three of which are indigenous to Africa: hpAfrica1, hpAfrica2, and hpNEAfrica. The oldest and most divergent population, hpAfrica2, evolved within San hunter-gatherers, who represent one of the deepest branches of the human population tree. Anticipating the presence of ancient H. pylori lineages within all hunter-gatherer populations, we investigated the prevalence and population structure of H. pylori within Baka Pygmies in Cameroon. Gastric biopsies were obtained by esophagogastroduodenoscopy from 77 Baka from two geographically separated populations, and from 101 non-Baka individuals from neighboring agriculturalist populations, and subsequently cultured for H. pylori. Unexpectedly, Baka Pygmies showed a significantly lower H. pylori infection rate (20.8%) than non-Baka (80.2%). We generated multilocus haplotypes for each H. pylori isolate by DNA sequencing, but were not able to identify Baka-specific lineages, and most isolates in our sample were assigned to hpNEAfrica or hpAfrica1. The population hpNEAfrica, a marker for the expansion of the Nilo-Saharan language family, was divided into East African and Central West African subpopulations. Similarly, a new hpAfrica1 subpopulation, identified mainly among Cameroonians, supports eastern and western expansions of Bantu languages. An age-structured transmission model shows that the low H. pylori prevalence among Baka Pygmies is achievable within the timeframe of a few hundred years and suggests that demographic factors such as small population size and unusually low life expectancy can lead to the eradication of H. pylori from individual human populations. The Baka were thus either H. pylori-free or lost their ancient lineages during past demographic fluctuations. 
Using coalescent simulations and phylogenetic inference, we show that Baka almost certainly acquired their extant H. pylori through secondary contact with their agriculturalist neighbors.

Serveur académique lausannois

Irish Universities

Warwick Research Archives Portal Repository

Cork Open Research Archive

MPG.PuRe

University of St. Andrews - Pure

International Institute for Applied Systems Analysis (IIASA)

FigShare

Modeling HIV-1 Drug Resistance as Episodic Directional Selection

The evolution of substitutions conferring drug resistance to HIV-1 is both episodic, occurring when patients are on antiretroviral therapy, and strongly directional, with site-specific resistant residues increasing in frequency over time. While methods exist to detect episodic diversifying selection and continuous directional selection, no evolutionary model combining these two properties has been proposed. We present two models of episodic directional selection (MEDS and EDEPS) which allow the a priori specification of lineages expected to have undergone directional selection. The models infer the sites and target residues that were likely subject to directional selection, using either codon or protein sequences. Compared to its null model of episodic diversifying selection, MEDS provides a superior fit to most sites known to be involved in drug resistance, and neither one test for episodic diversifying selection nor another for constant directional selection is able to detect as many true positives as MEDS and EDEPS while maintaining acceptable levels of false positives. This suggests that episodic directional selection is a better description of the process driving the evolution of drug resistance.